Risk Bounds for Classification Trees under a Margin Condition
Abstract
Risk bounds for Classification and Regression Trees (CART; Breiman et al., 1984) classifiers are obtained under a margin condition in the binary supervised classification framework. These risk bounds are obtained conditionally on the construction of the maximal deep binary tree, and they make it possible to prove that the linear penalty used in the CART pruning algorithm is valid under a margin condition. It is also shown that, conditionally on the construction of the maximal tree, the final selection by test sample does not dramatically alter the estimation accuracy of the Bayes classifier. In the two-class classification framework, the risk bounds proved here, obtained by using penalized model selection, validate the CART algorithm, which is used in many data mining applications in fields such as biology, medicine, and image coding.
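As a rough illustration of the two steps the abstract refers to (pruning with a penalty that is linear in the number of leaves, then final selection on a test sample), the following Python sketch works on a hypothetical list of pruned subtrees, each summarized only by its leaf count and error rates. The names `Subtree`, `select_penalized`, `pruned_sequence`, and `select_by_test_sample`, as well as all numbers, are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtree:
    """Hypothetical summary of one pruned subtree of the maximal tree."""
    leaves: int          # number of leaves
    train_error: float   # misclassification rate on the learning sample
    test_error: float    # misclassification rate on the held-out test sample


def select_penalized(subtrees, alpha):
    """Return the subtree minimizing train_error + alpha * leaves.

    The penalty is linear in the number of leaves; ties are broken in
    favor of the smaller tree.
    """
    return min(subtrees, key=lambda t: (t.train_error + alpha * t.leaves, t.leaves))


def pruned_sequence(subtrees, alphas):
    """Collect the distinct penalized minimizers obtained as alpha increases."""
    sequence = []
    for alpha in sorted(alphas):
        best = select_penalized(subtrees, alpha)
        if best not in sequence:
            sequence.append(best)
    return sequence


def select_by_test_sample(sequence):
    """Final selection: keep the subtree with the smallest test-sample error."""
    return min(sequence, key=lambda t: t.test_error)


# Made-up numbers, for illustration only.
candidates = [
    Subtree(leaves=1, train_error=0.40, test_error=0.41),
    Subtree(leaves=2, train_error=0.25, test_error=0.27),
    Subtree(leaves=4, train_error=0.15, test_error=0.20),
    Subtree(leaves=8, train_error=0.10, test_error=0.24),
]
sequence = pruned_sequence(candidates, alphas=[0.0, 0.02, 0.04, 0.1, 0.2])
final_tree = select_by_test_sample(sequence)
print([t.leaves for t in sequence], final_tree.leaves)
```

In this toy setting, larger values of `alpha` favor smaller subtrees, and the test sample is used only to choose among the subtrees that the penalized criterion has already retained.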
Similar resources
Margin Adaptive Risk Bounds for Classification Trees
Margin adaptive risk bounds for Classification and Regression Trees (CART; Breiman et al., 1984) classifiers are obtained in the binary supervised classification framework. These risk bounds are obtained conditionally on the construction of the maximal deep binary tree and make it possible to prove that the linear penalty used in the CART pruning algorithm is valid under a margin condition. It is also show...
Risk bounds for CART classifiers under a margin condition
Non-asymptotic risk bounds for Classification And Regression Trees (CART) classifiers are obtained in the binary supervised classification framework under a margin assumption on the joint distribution of the covariates and the labels. These risk bounds are derived conditionally on the construction of the maximal binary tree and make it possible to prove that the linear penalty used in the CART pruning alg...
Classification Algorithms using Adaptive Partitioning
Algorithms for binary classification based on adaptive partitioning are formulated and analyzed for both their risk performance and their friendliness to numerical implementation. The algorithms can be viewed as generating a set approximation to the Bayes set and thus fall into the general category of set estimators. A general theory is developed to analyze the risk performance of set estimator...
Classification Methods with Reject Option Based on Convex Risk Minimization
In this paper, we investigate the problem of binary classification with a reject option in which one can withhold the decision of classifying an observation at a cost lower than that of misclassification. Since the natural loss function is non-convex so that empirical risk minimization easily becomes infeasible, the paper proposes minimizing convex risks based on surrogate convex loss functions...
Adaptive Sampling Under Low Noise Conditions
We survey some recent results on efficient margin-based algorithms for adaptive sampling in binary classification tasks. Using the so-called Mammen-Tsybakov low noise condition to parametrize the distribution of covariates, and assuming linear label noise, we state bounds on the convergence rate of the adaptive sampler to the Bayes risk. These bounds show that, excluding logarithmic factors, th...